Tastes Great, Less Filling: Low-Impact OLAP MapReduce Queries on High-Performance OLTP Systems
Authors
Abstract
The previous decade saw the rise of separate, dedicated database management systems (DBMS) for online transaction processing (OLTP) and online analytical processing (OLAP) workloads [3]. The former focus on executing short-lived, small-footprint transactions with high throughput and strong consistency guarantees. OLAP DBMSs typically target longer-running and more complex queries that examine the database after it is offloaded from the front-end OLTP DBMS. For many, the latency overhead of transferring data between these two systems, as well as their administrative costs, is too onerous. A burgeoning alternative is a hybrid approach in which an OLTP system is able to execute OLAP-style queries alongside the transactional workload [1]. This gives users the ability to execute business intelligence and other analytical queries in “real-time” (i.e., without waiting for data to be copied to the OLAP system). Such an approach has its own drawbacks, however, especially in a clustered environment. If the data is spread across multiple machines, then the OLAP queries must be executed as heavy-weight distributed transactions, which are well known to significantly reduce the overall throughput of an OLTP system [2]. To overcome this problem, we propose a novel method of executing OLAP workloads on a parallel OLTP DBMS using the MapReduce programming model. OLAP queries are decomposed into map and reduce operations that are executed as separate transactions with either strong or weak consistency guarantees across the cluster. We implemented this model in the H-Store OLTP system [2] and evaluated it with a mixed OLTP/OLAP workload [1].
Body: OLAP queries executed as MapReduce jobs in an OLTP DBMS achieve the same latency as distributed transactions but improve throughput by 20%–50%.
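The decomposition described in the abstract can be illustrated with a minimal sketch. This is not H-Store's API or the authors' implementation: the partition layout, the row fields (`warehouse`, `amount`), and the function names `map_partition` and `reduce_partials` are all hypothetical. It only shows the general idea of answering an aggregate query with one small map step per partition, followed by a reduce step that merges the partial results, instead of one query that touches every partition at once.

```python
from collections import defaultdict

def map_partition(rows):
    """Map step (hypothetical): reads one partition only and
    emits a partial (sum, count) per grouping key."""
    partial = defaultdict(lambda: [0.0, 0])
    for row in rows:
        acc = partial[row["warehouse"]]
        acc[0] += row["amount"]
        acc[1] += 1
    return {key: tuple(acc) for key, acc in partial.items()}

def reduce_partials(partials):
    """Reduce step (hypothetical): merges the per-partition partial
    aggregates and computes the final average per key."""
    merged = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for key, (total, count) in partial.items():
            merged[key][0] += total
            merged[key][1] += count
    return {key: total / count for key, (total, count) in merged.items()}

# Illustrative data: two partitions, as if spread across two machines.
partitions = [
    [{"warehouse": 1, "amount": 10.0}, {"warehouse": 2, "amount": 30.0}],
    [{"warehouse": 1, "amount": 20.0}],
]

# Each map call stands in for a separate single-partition transaction.
partials = [map_partition(rows) for rows in partitions]
averages = reduce_partials(partials)
print(averages)
```

In this sketch each map call needs only local data, which is the property that lets such steps run as short transactions rather than as one cluster-wide distributed transaction; how consistency across the steps is enforced is outside the scope of the example.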
Similar resources
HyPer: HYbrid OLTP&OLAP High PERformance Database System
The two areas of online transaction processing (OLTP) and online analytical processing (OLAP) present different challenges for database architectures. Currently, customers with high rates of mission-critical transactions have split their data into two separate systems, one database for OLTP and one so-called data warehouse for OLAP. While allowing for decent transaction rates, this separation h...
HyPer: Adapting Columnar Main-Memory Data Management for Transactional AND Query Processing
Traditionally, business applications have separated their data into an OLTP data store for high throughput transaction processing and a data warehouse for complex query processing. This separation bears severe maintenance and data consistency disadvantages. Two emerging hardware trends allow the consolidation of the two disparate workloads onto the same database state on one system: the increas...
HyPer-sonic Combined Transaction AND Query Processing
In this demo we will prove that it is – against common belief – indeed possible to build a main-memory database system that achieves world-record transaction processing throughput and best-of-breed OLAP query response times in one system in parallel on the same database state. The two workloads of online transaction processing (OLTP) and online analytical processing (OLAP) present different cha...
Benchmarking Hybrid OLTP&OLAP Database Systems
Recently, the case has been made for operational or real-time Business Intelligence (BI). As the traditional separation into OLTP database and OLAP data warehouse obviously incurs severe latency disadvantages for operational BI, hybrid OLTP&OLAP database systems are being developed. The advent of the first generation of such hybrid OLTP&OLAP database systems requires means to characterize their...
Parallel Replication across Formats in SAP HANA for Scaling Out Mixed OLTP/OLAP Workloads
Modern in-memory database systems are facing the need of efficiently supporting mixed workloads of OLTP and OLAP. A conventional approach to this requirement is to rely on ETL-style, application-driven data replication between two very different OLTP and OLAP systems, sacrificing realtime reporting on operational data. An alternative approach is to run OLTP and OLAP workloads in a single machin...
Journal title:
- TinyToCS
Volume 1, Issue
Pages -
Publication date: 2012